51052/JEC/B600

METHODS OF RECORDING VOICE SIGNALS IN A MOBILE SET 

Cross-Reference to Related Application(s): 

This application is a divisional of U.S. Patent Application No. 09/747,932, 
filed December 22, 2000. 

1. Field of the Invention 

The present invention relates generally to telecommunications systems
and methods for recording information during phone calls, and specifically to
methods for recording information to a memory element during a call, in
which the conversation is recorded in a way that reduces the storage space utilized in
the memory element.

2. Background of the Invention 
References for IDS inclusion:

Re. 34,976 (cell phone digital recorder/live conversations)
6,064,792 (Fox) deferred recording.
4,495,647 (Burke) Digital voice storage mobile.
5,519,684 (Iizuka) Digital recorder, multiple tracks.
5,995,824 (Whitfield) cell phone vox.
5,867,793 (Davis) cell phone vox.
3,936,610 (Schiffman) switching controller - output.

Mobile phones (cellular phones) have become a standard form of
communication in industrialized countries. Communication with people in local
and wide area cell networks is commonplace. An artifact of this form of modern
communication is that it is often difficult to hear the voice of a person over a
mobile phone. This difficulty stems from both technological and environmental
shortcomings inherent in the communication type.

A mobile phone network is an intricate and complex array of devices. For
easy reference this disclosure describes a GSM (Global System for Mobile
Communication) style digital mobile phone system. However, the invention herein is
not limited particularly to this type of system. Generally a GSM network is composed of a
number of Mobile Service Centers (MSC), each with an integrated Visitor Location
Register (VLR). The MSC/VLR areas include a number of Location Areas
(LA), which are defined as part of a given MSC/VLR area. Mobile sets (MS), or
mobile phone subscribers, may freely roam within the coverage area without having
to send update information to the MSC/VLR area that controls
the LA. The cellular network is composed of all these elements and a multitude of
subscribers, each having a mobile set (MS).

The widespread use of mobile phones has produced a variety of different cellular
networks. Cellular networks within the same region may operate on a different technology
base. Some networks experience technical difficulties in the transmission and reception of
signals from the MS to the Base Station (BS), the cellular network's reception area for
receiving and transmitting information to each MS. These technical difficulties include
interference from any number of signal producing sources (including other subscribers),
geographical interference, structural interference and the like. The various sources can
individually or in combination contribute to poor reception of the signal from the BS to the MS, or
to an error in the signal from one station to the receiving station (MS/BS), causing the
transmission to be garbled or difficult to interpret.

Additionally, the environments where mobile sets are often used include
places where a subscriber may not be able to dedicate their full attention to the conversation
on a cell phone (e.g. when driving an automobile) or where the local surroundings
make hearing difficult (as in an area having a lot of background noise). To address this
problem there have been numerous recent developments to allow a subscriber to record
information either during or after a conversation, using their MS as either a note pad,
dictation device, or recorder for conversations. In all cases the use of the cellular phone as a
data storage device produces undesired drawbacks and stretches the limitations of existing
cellular phone technology. Some cell phones do offer a voice recording feature, either for live
conversation or as a dictation machine. These recorders use existing technology and
essentially combine two devices into one casing, instead of integrating a data recording
system into the mobile set so that the available real estate and power of a mobile set are
optimized.

The signal processing and data handling of a GSM phone utilizes technology which
can be adapted for use with the present invention. Conventional GSM mobile phones possess
an analog to digital converter (ADC)/audio filter that converts the analog microphone
signal to digital speech samples at a sample rate of 8 kHz with 13 bits per sample. Voice
encoders may process speech samples in 20 millisecond segments, where each segment is
compressed into a speech frame of N bits. The actual number of bits per speech frame
depends on the particular speech encoder used. The speech encoder may provide for half rate
speech, full rate, enhanced rate or variable rates for adaptive multi-rate speech. The encoder
compresses speech so that the number of bits per second is minimized while still giving good
quality speech. Voice encoder frames are interleaved and coded for error correction and
detection and then transmitted to the base station.
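
As a rough worked example of the figures given above (8 kHz sampling, 13 bits per sample, 20 millisecond segments), the following sketch computes the uncompressed size of one segment and the resulting bit rates. The value chosen for N (260 bits, which corresponds to the GSM full rate encoder) is an assumption made for illustration only; the text above does not fix N, and the function names are hypothetical.

```python
# Figures taken from the paragraph above.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 13
FRAME_MS = 20

# Assumption for illustration: N = 260 bits per speech frame
# (the GSM full rate encoder). Other encoders use other values.
N_BITS = 260


def samples_per_frame(rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of speech samples in one segment."""
    return rate_hz * frame_ms // 1000


def raw_bits_per_frame():
    """Uncompressed bits in one segment, before voice encoding."""
    return samples_per_frame() * BITS_PER_SAMPLE


def bit_rate(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per second for a stream of frames of the given size."""
    return bits_per_frame * 1000 // frame_ms


if __name__ == "__main__":
    print(samples_per_frame())             # samples per 20 ms segment
    print(raw_bits_per_frame())            # raw bits per segment
    print(bit_rate(raw_bits_per_frame()))  # raw bits per second
    print(bit_rate(N_BITS))                # encoded bits per second
```

Under these assumptions, storing encoded speech frames rather than raw samples reduces the memory needed per segment by a factor of eight before any non-speech frames are dropped.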

Downlink voice operations, received through the base station, go through the inverse
process of the voice encoder. A digital to analog converter (DAC)/audio filter performs the inverse
operations of the ADC/audio filter in processing downlink speech frames. A voice activity
detector (VAD) generates a binary flag (value 0 or 1) indicating whether the subscriber is
speaking (value 0) or not (value 1). The VAD used in the GSM standard suppresses
uplink transmission during speech silence intervals to conserve the battery charge.
It is possible to utilize much of the existing signal processing of a GSM compatible phone to
enhance the data storage capacity of a mobile set, and record conversations.

It is an object of the present invention to provide a mobile set having sufficient 
memory capacity to store voice conversations. 

Another objective of the present invention is to provide a means for allowing a
subscriber to record a voice conversation in real time for later retrieval.

Another objective of the present invention is to provide a subscriber with the
ability to record a conversation and recall the information somewhat contemporaneously in
the same phone conversation.

Another object of the present invention is to provide a means to streamline the
manner in which voice information is recorded, making greater effective use of the memory
element within a subscriber's mobile set.

It is still a further objective of the present invention to allow subscribers to
record both voice and data information into a mobile set memory element, and to provide
accurate time indexing so the messages can be reproduced in the same form that they were
transmitted in.

It is still a further objective of the present invention to provide a data file 
management system of data stored on the memory element for easy retrieval and sorting, 
either through the use of the MS or another device such as a desktop computer. 

At least one of the present objectives is addressed in the following disclosure. 



SUMMARY OF THE INVENTION

The present invention relates to a mobile set having a data recorder system. The
invention relates to the capture of real time voice conversations on mobile sets ("cellular
phones"); however, the system can also be used to capture multimedia signals, e-mail and data
transmissions as the technology and capabilities of wireless communications continue to
expand.

In one embodiment the invention is a method in a mobile set for storing voice
recordings. The method comprises controlling a voice activity detector (VAD) to identify
speech containing time frames from at least one uplink and at least one downlink signal and
recording the speech containing time frames from the uplink and downlink signals such that
each time frame is recorded sequentially with a time stamp for each time frame. In this
embodiment the mobile set receives two signals forming the two sides of a phone
conversation. To preserve memory space, the individual time frames are arranged
sequentially into a single data file and written to memory.

In a second embodiment of the present invention, a method in a mobile set for
determining record worthy voice time frames is described. The method comprises receiving
a first signal in a voice activity detector, receiving a second signal in the voice activity
detector, and comparing the first signal to the second signal. The compared signals must be
of the same time frame. The signal having the higher voice data content is selected for
recording. It is also permissible to record both signals if both have sufficient voice data
meeting a predetermined threshold. Normally only one person is speaking, thus the method of
this embodiment allows the recording of only the person who is speaking. As the
conversation proceeds, both people may speak, neither may speak, or only one person may speak. The
method of the present embodiment compares each uplink and downlink time frame as paired
events, but separate from the preceding time frame, and independent of the following time
frame. Signals (either uplink or downlink) containing voice data are selected for recording.
Similar to the previous embodiment, if neither person is speaking, then the corresponding
time frames will have less than the threshold data required for recording. Those low or no
data time frames are replaced with placeholders according to a data compression scheme.
The placeholders are again recorded sequentially with the data containing time frames so the
data file may be played back with a linear representation of the voice time frames. In the
event both people are speaking simultaneously, the present method selects both data
containing time frames for recording, but continues to arrange the time frames sequentially.
Thus while two records may have the same time frame, they are recorded one following the
other into memory. On playback, if two recorded time frames have the same time frame,
they are both converted into analog signals and played simultaneously over the mobile set
speaker. It is important to note there is a wide variety of combinations for what information
is recorded and not recorded. These range from recording only the higher data containing
frame of one channel (dropping the lower data containing frame completely, with no placeholder) to
recording all frames sequentially, and everything in between.
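
The pairwise comparison of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the numeric voice content scores, the `THRESHOLD` value and the name `select_for_recording` are all hypothetical, since the text does not specify how voice data content is measured.

```python
# Hypothetical threshold on a hypothetical 0.0-1.0 "voice content" score.
THRESHOLD = 0.5


def select_for_recording(time_stamp, uplink_score, downlink_score,
                         threshold=THRESHOLD):
    """Decide what to record for one paired time frame.

    Each pair is judged on its own, independent of the preceding
    and following time frames. Returns a list of (time_stamp, source)
    records, where source is "U", "D" or the placeholder "PL".
    """
    up_ok = uplink_score >= threshold
    down_ok = downlink_score >= threshold
    if up_ok and down_ok:
        # Both parties speaking: record both, one after the other.
        return [(time_stamp, "U"), (time_stamp, "D")]
    if up_ok:
        return [(time_stamp, "U")]
    if down_ok:
        return [(time_stamp, "D")]
    # Neither party speaking: a single placeholder marks the pause.
    return [(time_stamp, "PL")]


def build_stream(score_pairs):
    """Arrange the selected frames sequentially into one data stream."""
    stream = []
    for t, (u, d) in enumerate(score_pairs, start=1):
        stream.extend(select_for_recording(t, u, d))
    return stream
```

For instance, `build_stream([(0.9, 0.1), (0.0, 0.0), (0.8, 0.7)])` yields an uplink record, a placeholder, and then an uplink and a downlink record sharing the same time stamp.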

In a third embodiment a computer-readable medium containing instructions for
controlling a mobile set processor to record multimedia signals is described. The method
comprises controlling a voice activity detector to compare a plurality of voice signals having
identical time stamps and arranging the voice signals such that data containing time stamp
sequences are sequentially placed into a single data file. The method also includes
controlling a processor that will identify non-voice signals containing the same time stamp as
data containing voice time stamp sequences and sequentially recording the data containing
voice signals and the corresponding time stamp non-voice signals such that both the voice
and non-voice data signals will be sequentially recorded into a memory element as a single
data file. In this embodiment the progression of mobile sets to handle multimedia signals is
accommodated. The computer readable medium contains instructions for controlling the
mobile set processor first as a voice activity detector, then as a multimedia signal processor.
Where a mobile set is capable of handling multimedia signals, the voice activity detector
will determine which time frames of speech (either uplink or downlink) contain data. The
data containing time frame is then selected for recording into memory. The
processor then identifies any companion signal, such as a video signal, having the same time stamp
as the voice signal time frame to be recorded. The non-voice time frame (video frame) is then
recorded with the corresponding time frame of voice data. In this way the appropriate video
sequence of uplink or downlink video is recorded with the speaker (such as in a two way
video conference call). If the voice signal is not recorded, the video sequence similarly will
not be recorded.

In a fourth embodiment, a computer-readable medium containing a data structure for
stored phone conversations is described. The data structure stores voice signals and comprises a
conversation list containing an entry for each of one or more phone conversations. Each
entry comprises a single string of data records wherein each data record has a file pointer to
the next record, the last record having an end of file marker. Each record corresponds to at
least one time stamp of the phone conversation for use in restoring the data structure to a
medium understandable by a subscriber. The computer-readable medium containing a data
structure represents the stored data on a memory element created by any of the previous
embodiments. The data structure is composed of conversations that are stored as computer
file records. Each file represents a single conversation and is a single string of data broken
into records. The records contain pointers to each following record, allowing the records to
be stored non-sequentially in the memory element while preserving the sequential nature of the
conversation on playback. The last record of a file contains an end of file marker. If the file
includes multimedia material, such as video conference information, the time frames of video
corresponding to recorded speech are also saved. When the record is played back only the
corresponding speech and video sequences are played.
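
A data structure of this kind might be sketched as a chain of records, each pointing to the next. The sketch below is illustrative only: the `Record` fields, the use of a dictionary keyed by address to stand in for the memory element, and the use of `None` as the end of file marker are all assumptions, not details given in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption: None stands in for the end of file marker.
END_OF_FILE = None


@dataclass
class Record:
    time_stamp: int            # time frame this record belongs to
    payload: str               # stand-in for encoded voice or video data
    next_addr: Optional[int]   # pointer to the next record, or END_OF_FILE


def read_conversation(memory: Dict[int, Record], first_addr: int) -> List[Record]:
    """Follow the record pointers from the first record to the end of file."""
    records = []
    addr: Optional[int] = first_addr
    while addr is not END_OF_FILE:
        record = memory[addr]
        records.append(record)
        addr = record.next_addr
    return records


# Records are deliberately stored at non-sequential addresses.
memory = {
    40: Record(1, "U1", 7),
    7: Record(1, "D1", 93),
    93: Record(2, "U2", END_OF_FILE),
}

# Conversation list: one entry per stored conversation, giving the
# address of its first record. The key "call-001" is hypothetical.
conversation_list = {"call-001": 40}
```

Following the pointers from `conversation_list["call-001"]` returns the payloads in conversation order (U1, D1, U2) even though the records are scattered through memory.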

A fifth embodiment describes a method in a mobile set for a subscriber to select data
to be stored. The method comprises displaying a plurality of recording modes while
indicating a selection means for choosing a recording mode. The subscriber then selects a
recording mode and the mobile set provides a confirmation signal after a recording mode has
been selected. While the methods of the present invention are principally designed to
streamline the recording of speech time frames without recording non-speech containing time
frames, a subscriber may manually opt to have all frames of a conversation recorded, or only
one line recorded. Thus a subscriber may manually select from a command list. In response
to a selection from the command list, the mobile set will record all speech time frames (both
data and non-data containing frames), record only the uplink signal, only the downlink signal,
or only data containing frames of either the uplink or downlink signals. Further, the command
list will provide the user with the option of not recording accompanying multimedia time
frames corresponding to the voice time frames, or, in response to a different selection from
the command list, of recording the multimedia time frames independently from the voice time
frames.
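
The voice recording modes listed in this embodiment can be illustrated with a short sketch. The mode names, the `frames_to_record` function and the representation of a frame as a `(source, has_speech)` pair are hypothetical; the multimedia options are omitted for brevity.

```python
from enum import Enum


class RecordingMode(Enum):
    """Hypothetical names for the voice modes on the command list."""
    ALL_FRAMES = "all"           # data and non-data containing frames
    UPLINK_ONLY = "uplink"       # only the uplink signal
    DOWNLINK_ONLY = "downlink"   # only the downlink signal
    SPEECH_ONLY = "speech"       # only data containing frames of either signal


def frames_to_record(frames, mode):
    """Filter frames according to the selected recording mode.

    frames is a list of (source, has_speech) tuples, where source is
    "U" for uplink or "D" for downlink.
    """
    if mode is RecordingMode.ALL_FRAMES:
        return list(frames)
    if mode is RecordingMode.UPLINK_ONLY:
        return [f for f in frames if f[0] == "U"]
    if mode is RecordingMode.DOWNLINK_ONLY:
        return [f for f in frames if f[0] == "D"]
    if mode is RecordingMode.SPEECH_ONLY:
        return [f for f in frames if f[1]]
    raise ValueError(f"unknown recording mode: {mode!r}")
```

A default of `SPEECH_ONLY` would match the streamlined recording the disclosure is principally designed for, with the other modes available as manual overrides.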

A sixth embodiment of the present invention describes a method in a mobile set for
replaying recorded conversations. The method comprises displaying a line indicating a data
structure of recorded conversations and, in response to selection of the displayed line,
replaying a recorded conversation. The command list allowing stored data to be played back
may be available to a subscriber during a conversation, so if a subscriber wishes to replay
information he may do so while he is still in a conversation.

Finally a mobile set having a voice recording means for storing voice conversations is
disclosed. The mobile set of the present invention can record signals received through the
mobile set and can play back at least a portion of those signals on the mobile set. The mobile
set comprises an uplink/downlink switch for selecting speech frames from either an uplink or
downlink signal, at least one switching logic controller for switching between the uplink and
downlink signals, a method of file header generation for generating headers for recorded
speech files, a recorder controlling means for configuring and controlling a recorder
operation in one of several modes available to a subscriber, and a memory element capable of
storing the voice recordings.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates the relationship of mobile sets and a base station in operation. 
Figure 2 shows the basic logic steps for sequential file recording. 
Figure 3 shows the basic logic for multiple signal processing. 
Figure 4 shows the logic for playback of recorded data. 

DETAILED DESCRIPTION OF THE INVENTION 

Prior to the discussion of the present invention, certain terms used herein convey a
meaning which extends beyond their ordinary meaning in the field of the present invention.
For clarification the following definitions are used in this description.

"Data stream" refers to the information stored into memory and relayed from the
processor of the mobile set to the memory element. The data stream contains a series of data
records which are formatted similar to any of a variety of computer files. Each record
possesses a pointer to the next sequential record, and the last record in the file contains an
end-of-file marker. "Data stream" refers to a single stored file of information and may
comprise any number of data records. The "data stream" is composed of compressed data
containing either digital or analog voice information, or other electronically storable
information (such as video, e-mail or computer files).

"Downlink" refers to any signal received by a mobile set regardless of source.

"Memory" as used herein refers to any media capable of storing information in
electronic form. Though computers and mobile phones often use flash memory for storing
information, the present discussion includes the use of both persistent memory (retaining
information even if no power is supplied to the memory element) and flash memory, having
the characteristic of not being able to store information without constant power supplied to
the memory element. In the discussion of the present invention the term "memory" is used to
signify either flash memory or persistent memory.

"Mobile set" is used to describe any number of portable communications devices, and
is not restricted to the field of commercially available cell phones. Although "mobile set"
certainly includes cellular phones, it also more generally includes any GSM compatible
phone or mobile communications phone (such as two way radios, "walkie-talkies", satellite
phones, etcetera). The use of the term "mobile set" furthermore is not restricted to portable
communications devices based strictly on speech. The use of "mobile set" in the present
invention also includes portable communications devices which, in addition to being able to
send and receive voice signals, are also capable of sending and receiving data signals of
various types (such as video (multimedia), e-mail, computer files and non-voice style
electronic information in general).

"Playback" refers to the recovery and restoration of data (digital or analog) into a
medium the subscriber can understand. It also requires the correct timely organization of all the
data in the same sequence as originally received. While the data management system of the
present invention includes the ability to receive and record several types of data streams, the
playback feature allows the reproduction of all stored data as well as the ability to properly
assign time codes to non-voice information which may be stored. The nature of the invention
in several embodiments does not permit "true" playback. That is, the playback of the
recorded information is not 100% restored to its original form. Indeed often only half of the
original data (or less) will be part of the data available for playback. While "true" playback is
possible, it is not in any way suggested nor required in the present invention.

"Streamlining" refers to the process by which a processor accesses a variety of
different data time frames and connects them into a single data stream while preserving the
identity and source of each data record. Streamlining is a process by which multiple data
types of both voice and non-voice information may be connected accurately into a single data
string, and recovered later without errors in reproduction of the original various signals. The
processor in a mobile set performs a number of functions at various times or cycles. The
processor acts as a frame comparator, determining which time frames are to be forwarded to
the data recorder. As a frame comparator the processor may substitute or delete any
particular frame. The processor also operates as the voice activity detector (VAD). The
combination of different process cycles the processor engages in to create the single data stream
for recording is referred to as "Streamlining."

"Subscriber" refers to anyone using a mobile set.

"Uplink" refers to any transmission of information from the mobile set.

The present invention relates to a system and method of recording voice conversations
using a mobile set. The basic structure of a mobile set used as a portable communications
device is loosely shown in Figure 1. The mobile set 20 operates in a cell 22, which exists in a
larger communications network such as a Public Land Mobile Network (PLMN) 10. Within
each cell 22 are base stations 24 used as receivers and transmitters of the signals used to
communicate with a mobile set 20. The various signals into and out of the base stations 24
are controlled through a series of controllers, registries, and routing equipment that make up
different parts of the PLMN 10. For the purposes of this disclosure, only signals to and from
the mobile set 20 are considered, and the routing of information and signals through the
whole of the PLMN is not discussed. Whenever a subscriber uses a mobile set 20 to
communicate, all uplink signals are transmitted to the base station 24 of the cell 22, and all
incoming signals come from the PLMN 10 through the base station 24. The exception to
this occurs in radio phones or other communications devices designed to communicate
directly with each other without the use of a base station.

All signals received by the mobile set 20, whether from a wire line caller or another
mobile subscriber, will be received by the mobile set 20 through the base station 24. Any
transmissions from the base station 24 to the mobile set 20 are referred to herein as downlink
signals. Any transmissions from the mobile set 20 to the base station 24 are uplink
signals. Signals transmitted between the base station 24 and the mobile set 20 are
generally digital signals. It is often the case that mobile subscribers will call each other from
their mobile sets 20, and those uplink signals go to a base station 24 and are processed through
the PLMN 10 before being re-transmitted to the appropriate receiving mobile set 20. As the
technology and options of mobile sets and base stations (and PLMNs) increase, conference
calls between multiple mobile subscribers and wire line callers will be possible. In any
combination of communications from either wireless or wire line subscribers, the present
invention can successfully record the voice and data signals to, and from, a mobile set.

To preserve memory space, the present invention describes a method for storing
voice recordings in a mobile set. In its basic form, the method comprises controlling a voice
activity detector to identify speech containing time frames from at least one uplink and at
least one downlink signal. Once the speech containing time frames are identified, the speech
containing time frames are recorded. The speech containing time frames from the uplink and
downlink signals are recorded sequentially with a time stamp for each time frame.

For the method of the present invention, a dedicated voice activity detector may be
used as part of the architecture in the mobile set 20. However, it is more common in GSM
compatible phones that the design of the GSM mobile set 20 allows the processor 108
to operate as a voice activity detector during certain operation cycles. Reference to a cycle
here does not mean a single clock cycle, but rather a series of clock cycles which are required
to execute a single function in the processor (such as encoding a speech frame, decoding a
speech frame, or comparing two frames, etcetera). This feature is generally related to the
uplink side for preserving battery life. Thus for the present invention the method may utilize
the voice activity detector cycles of the processor 108 of a GSM phone and tie
the downlink signal into the voice activity detector cycles as an extra series of instructions. Both
the uplink and downlink signals are paired based on their time frames and recorded as a
single data stream into memory 112. The processor records each time frame of uplink and
downlink signal, alternating between the two sources.

Referring now to Figure 2, the method sorts received or downlink 102 signals and
uplink 104 signals in the processor 108. The processor 108 may have a built in memory
buffer 106, or the buffer may be separate as shown. The processor 108 alternates between time
frames of the downlink 102 and uplink 104 signals, arranging them into a single data stream
for recording into memory 112. Simultaneously the uplink 104 signal is sent to the antenna
120 for transmission, and the downlink 102 signal is converted into a form the subscriber can
understand at either the speaker 124 or display 126.
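
The alternation between uplink and downlink time frames can be sketched as a simple interleaving step. The function name `streamline` and the tuple layout `(time_stamp, source, frame)` are assumptions made for illustration; the frames themselves are represented by short strings.

```python
def streamline(uplink_frames, downlink_frames):
    """Interleave paired uplink and downlink frames into one data stream.

    Frames at the same position in the two lists are assumed to belong
    to the same time frame. Each output element carries a time stamp
    and its source so that the identity of every record is preserved.
    """
    stream = []
    pairs = zip(uplink_frames, downlink_frames)
    for time_stamp, (up, down) in enumerate(pairs, start=1):
        stream.append((time_stamp, "U", up))
        stream.append((time_stamp, "D", down))
    return stream
```

Calling `streamline(["u1", "u2"], ["d1", "d2"])` produces the order uplink 1, downlink 1, uplink 2, downlink 2. Because each element keeps its time stamp and source, playback can separate or recombine the two sides of the conversation.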

Another method comprises receiving both uplink 104 and downlink 102 signals and
storing them in the processor buffer 106. The uplink 104 and downlink 102 signals are
compared to each other in the voice activity detector cycle. The signals compared to each
other must have the same time stamp (be of the same time frame). In operation, each time
frame that is processed through the voice activity detector 108 is assigned a logic value.
Time frames designated as record worthy (value 1) are recorded while those not record
worthy (value 0) are dropped from the data stream to be recorded. The dropped data frames
are replaced with a placeholder that permits the playback to accurately reproduce pauses in
the original conversation. The manner of replacing non-record worthy time frames with
placeholders may be done by various data compression means and is not per se an inventive
aspect of the present invention. In this method only half the data of the conversation is
recorded. In general conversation, only one person is speaking at a time. To preserve
memory 112 space the method of the present invention distinguishes the speech and
non-speech time frames and records only the speech containing time frames. The non-speech
containing time frames are dropped from the data stream that is recorded. Because a
placeholder is inserted into the space of each time frame that is not recorded, the linear time
relationship between the speech containing time frames is not lost. When the uplink 104 and
downlink 102 signals for a particular time frame both contain no speech, only one
placeholder need be inserted into the recording data stream. The placeholder for the
non-speech containing frames will be restored to non-speech pauses when the data is recovered
for playback. The signals to be recorded are then sent to memory 112 while the buffer is
cleared for the next batch of time frames. Processors generally operate at a much faster cycle
time than the rate at which uplink and downlink time frames are loaded into the buffer. Thus
the voice activity detector cycles can clear the buffer of stored speech time frames without
the buffer becoming full. Once the time frames are selected for recording, they are arranged
into a single data stream by the processor 108. This conserves space as the data stream can
now be recorded as a computer file composed of records. Each record has a record pointer
showing where the next sequential record is. The last record has an end of file marker. The
file may contain records which contain both voice and non-voice data.
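
The disclosure leaves the placeholder encoding to "various data compression means". One possible scheme, offered purely as an assumption and not as the disclosed method, is to run-length code consecutive placeholders when recording and to expand them back into pauses on playback. The names `compress_placeholders` and `expand_for_playback`, and the string labels used for frames, are hypothetical.

```python
PL = "PL"  # label standing in for a single placeholder


def compress_placeholders(stream):
    """Collapse each run of placeholders into one (PL, run_length) entry."""
    out = []
    for item in stream:
        if item == PL:
            if out and isinstance(out[-1], tuple) and out[-1][0] == PL:
                # Extend the current run of placeholders.
                out[-1] = (PL, out[-1][1] + 1)
            else:
                out.append((PL, 1))
        else:
            out.append(item)
    return out


def expand_for_playback(compressed, silence="<pause>"):
    """Restore each placeholder run to the same number of silent frames."""
    out = []
    for item in compressed:
        if isinstance(item, tuple) and item[0] == PL:
            out.extend([silence] * item[1])
        else:
            out.append(item)
    return out
```

With this scheme a long pause costs one small entry in memory, yet the linear time relationship between the speech containing frames is preserved because the run length records how many time frames were dropped.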

Thus the voice activity detector cycle looks at both the uplink and downlink time
frames and assigns them a logic value of one or zero. The following example shows the logic
executed by the voice activity detector cycle and the processor during a "frame comparator"
step. The voice activity detector determines if the speech time frame (either uplink or
downlink source) contains record worthy data. If so the time frame is assigned a high logic
value (1). If not the frame is given a low logic value (0). Once the speech frames for a given
time are assigned values, they are returned to the buffer 106 for the next processor cycle. The
processor 108 then retrieves the data from the buffer 106 and sends the high logic value
frames to the data recorder 110 for recording. The low logic value frames are dropped, and
substituted with a placeholder as previously described. (However, to further conserve space in
memory, the placeholder for a low logic speech frame may be omitted, except where both
uplink and downlink signals contain no data. The proper sequence of timing for the speech
frames can be derived from only the high logic frames that are recorded.) The processor acts
as a switching logic controller in determining which time frame to record when sorting
through the uplink and downlink signals (or various uplink and downlink signals).

Example 1

U1=1 then record
U1=0 then drop
D1=1 then record
D1=0 then drop.

The recorded data stream, after streamlining, may look like U1, D1, U2, D2, etc.
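
The logic of Example 1 can be sketched in a few lines. This is an illustrative reading of the example under stated assumptions: the function name `frame_comparator`, the `keep_placeholders` switch and the string labels are hypothetical, and the VAD logic values are supplied directly rather than computed from audio.

```python
def frame_comparator(vad_pairs, keep_placeholders=True):
    """Build the recorded data stream from paired VAD logic values.

    vad_pairs is a list of (uplink_value, downlink_value) tuples, one
    per time frame, where 1 means record worthy and 0 means not.
    Returns labels such as "U1", "D1" and the placeholder "PL".
    """
    stream = []
    for n, (u, d) in enumerate(vad_pairs, start=1):
        if u == 0 and d == 0:
            # Both sides silent: one placeholder is always inserted.
            stream.append("PL")
            continue
        for label, value in ((f"U{n}", u), (f"D{n}", d)):
            if value == 1:
                stream.append(label)      # high logic value: record
            elif keep_placeholders:
                stream.append("PL")       # low logic value: substitute
    return stream
```

When every frame is record worthy the result is U1, D1, U2, D2, matching the stream shown in Example 1. Setting `keep_placeholders=False` corresponds to the space saving variant in which a placeholder is kept only where both signals contain no data.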

The high logic time frames are recorded into a single data stream to allow for the file
record to be stored as a computer file. Depending on the operation of the mobile set, the
operation of the voice activity detector cycle and frame comparator cycles may be combined
into just the voice activity detector cycle. The added benefit is that high logic signals may be sent
directly to the data recorder without having to go back into the memory buffer, which reduces
the power consumption of the operation.

The methods described above are also executable when dealing with signals from a
non-voice source, such as video, text messaging, e-mail or other signal the mobile set is
capable of receiving. As the abilities of mobile sets expand, and offer additional features to
subscribers, such as mobile video conference calling, wireless e-mail and web browsing, the
next generation of mobile set will have a much broader array of data to contend with.
Memory for recording information in a mobile set will therefore be at a premium. The
recording of video signals accompanying voice signals (such as in a conference call) may be
selectively handled so that only the video time sequences corresponding to record worthy
voice signals are recorded. The uplink and downlink signal paths would similarly be tracked
so the voice and video of the appropriate source is maintained.

By way of example, if a mobile set is receiving two downlink signals of voice (D1 and
D2) and two downlink signals of video (V1 and V2), then only the video time frame corresponding to the
record worthy voice time frame (when someone is speaking) will be recorded (Figure 3).
Thus the party to the phone conversation who is not speaking is not recorded for
either voice or video. The data received goes to the processor as events, and the operation
performed is either the voice activity detection (VAD) or the frame comparator (FC). The FC
cycle executes logic that results in the recording of an uplink signal (Un), a downlink signal
(Dn) or a placeholder (PL).
Example 2
VAD/FC Operation and result

Event  Operation  Logic                                       Record
1      VAD        D1=1 or D1=0                                n/a
2      VAD        D2=1 or D2=0                                n/a
3      FC         If D1=1, then V1=1, then record D1 and V1.  D1, V1
4      FC         If D1=0, then V1=0, then drop D1 and V1.    PL
5      FC         If D2=1, then V2=1, then record D2 and V2.  D2, V2
6      FC         If D2=0, then V2=0, then drop D2 and V2.    PL
7      VAD        U1=1 or U1=0                                n/a
8      FC         If U1=1, then U2=1, then record U1 and U2.  U1, U2
9      FC         If U1=0, then U2=0, then drop U1 and U2.    PL



In this example, the data stream which is recorded is derived in the frame comparator
cycle, and may appear like D1, V1, PL, D2, V2, PL, U1, U2, PL. Alternatively, if the PL is not
recorded where there is a frame of actual data, the data stream would look like D1, V1, D2, V2,
U1, U2.
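
The frame comparator logic of Example 2 can be illustrated with the following sketch. It is
not the disclosed implementation: it assumes each FC event is supplied as a primary signal,
the signal paired with it, and the VAD logic value of the primary signal. The names Event,
frame_comparator and keep_placeholders are hypothetical.

```python
from typing import List, Tuple

PLACEHOLDER = "PL"

# One FC event: (primary signal, paired signal, VAD logic of the primary signal).
Event = Tuple[str, str, int]


def frame_comparator(events: List[Event],
                     keep_placeholders: bool = True) -> List[str]:
    """Return the recorded data stream for a sequence of FC events.

    A high logic primary signal is recorded together with its paired
    signal. A low logic primary signal drops both, and a placeholder is
    written in their place unless placeholders are suppressed.
    """
    stream: List[str] = []
    for primary, paired, logic in events:
        if logic == 1:
            stream.extend([primary, paired])
        elif keep_placeholders:
            stream.append(PLACEHOLDER)
    return stream


# FC events 3, 4, 5, 6, 8 and 9 of Example 2, in order.
events = [
    ("D1", "V1", 1), ("D1", "V1", 0),
    ("D2", "V2", 1), ("D2", "V2", 0),
    ("U1", "U2", 1), ("U1", "U2", 0),
]
print(frame_comparator(events))
# ['D1', 'V1', 'PL', 'D2', 'V2', 'PL', 'U1', 'U2', 'PL']
print(frame_comparator(events, keep_placeholders=False))
# ['D1', 'V1', 'D2', 'V2', 'U1', 'U2']
```

For simplicity the sketch suppresses every placeholder when keep_placeholders is False; a
fuller version would retain the placeholder where both uplink and downlink carry no data.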

The execution of these methods originates from a computer-readable medium
containing instructions for controlling a mobile set processor to record multimedia signals.
The computer-readable medium comprises instructions for controlling a processor (VAD/FC)
to compare a plurality of voice signals having identical time stamps, and to arrange the voice
signals such that data containing time stamp sequences are sequential in a single data file.
The computer-readable medium also has instructions for controlling a processor to identify
non-voice signals containing the same time stamp as data containing voice time stamp
sequences. The data containing voice signals and the non-voice signals with the corresponding
time stamp are then sequentially recorded such that both the voice and non-voice data signals are
recorded into a memory element as a single data file. However, it is not necessary that
computer files, such as text messages or application data files, be recorded into memory in
the same manner as voice and video. These files would be stored whole, without any
insertion of placeholders for actual data. An arrangement of multiple data files forms a data
structure in the memory element.

The memory element then forms a computer-readable medium containing a data
structure for storing voice signals. The data structure comprises a conversation list
containing an entry for each of one or more phone conversations. Each entry comprises a
single string of data records wherein each data record has a file pointer to the next record, the
last record having an end of file marker. Depending on how a particular mobile set is
designed to store information, each record will correspond to one or more time frames of
the phone conversation for use in restoring the data structure to a media format
understandable by a subscriber.
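
The conversation list described above resembles a list of singly linked record chains. The
sketch below is one possible reading under that assumption. The names Record, build_entry
and replay are hypothetical, and an in-memory reference stands in for the file pointer.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Record:
    payload: str                     # one or more time frames, e.g. "U1"
    next: Optional["Record"] = None  # stands in for the file pointer to the next record
    end_of_file: bool = False        # marker carried by the last record only


def build_entry(payloads: List[str]) -> Optional[Record]:
    """Chain payloads into a single string of records for one conversation."""
    head: Optional[Record] = None
    # Build from the back so that each record can point at its successor.
    for payload in reversed(payloads):
        head = Record(payload, next=head, end_of_file=(head is None))
    return head


def replay(entry: Optional[Record]) -> Iterator[str]:
    """Follow the pointers from the first record to the end of file marker."""
    record = entry
    while record is not None:
        yield record.payload
        if record.end_of_file:
            break
        record = record.next


# The conversation list holds one entry per recorded phone conversation.
conversation_list = [
    build_entry(["U1", "D1", "U2", "D2"]),
    build_entry(["D1", "PL", "U2"]),
]
print(list(replay(conversation_list[0])))  # ['U1', 'D1', 'U2', 'D2']
```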

As described in the method of selecting signals for recording, the data may contain
voice and multimedia data, with fragments of various types of data strung together into a
single data stream. In the case of a mobile set that also acts as a PDA (Personal Digital
Assistant), the mobile set may also have the ability to record computer files. Files received by
the mobile set may be stored sequentially, or contain a file pointer in the last record of the file
that identifies any attachments.

While the computer-readable medium of the present invention has a default method of
determining what data is "record worthy," a subscriber may alter the default method by
instructing the mobile set to record information in a variety of other ways. Therefore an
additional method in a mobile set for selecting data to be stored comprises displaying a
plurality of recording modes; indicating a selection means for choosing a recording mode;
and, in response to selection of one of the displayed recording modes, selecting a different
method of recording. In this manner a subscriber can choose to record all time
frames of both the uplink and downlink signals, or record only the uplink, or only the downlink.
Where multimedia files are concerned, this option permits the user to preserve memory space
by ignoring multimedia material except voice. Alternatively, the subscriber can turn recording off
completely.
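
The mode selection steps can be sketched as follows. This is a hedged illustration only: the
mode names, the menu text and the functions display_modes, select_mode and should_record are
hypothetical. The sketch also assumes that the uplink-only and downlink-only modes keep every
frame of the chosen signal, which the text does not specify, and it omits the voice-only
multimedia option for brevity.

```python
from enum import Enum
from typing import List


class RecordingMode(Enum):
    RECORD_WORTHY = "Record worthy frames only (default)"
    ALL_FRAMES = "All uplink and downlink frames"
    UPLINK_ONLY = "Uplink only"
    DOWNLINK_ONLY = "Downlink only"
    OFF = "Recording off"


def display_modes() -> List[str]:
    """Return the menu lines presented to the subscriber."""
    return [f"{n}. {mode.value}" for n, mode in enumerate(RecordingMode, start=1)]


def select_mode(choice: int) -> RecordingMode:
    """Map a 1-based menu choice to a recording mode."""
    modes = list(RecordingMode)
    if not 1 <= choice <= len(modes):
        raise ValueError(f"choice must be between 1 and {len(modes)}")
    return modes[choice - 1]


def should_record(mode: RecordingMode, source: str, logic: int) -> bool:
    """Decide whether a frame from source "U" or "D" is recorded."""
    if mode is RecordingMode.OFF:
        return False
    if mode is RecordingMode.ALL_FRAMES:
        return True
    if mode is RecordingMode.UPLINK_ONLY:
        return source == "U"
    if mode is RecordingMode.DOWNLINK_ONLY:
        return source == "D"
    return logic == 1  # default: keep only high logic (record worthy) frames


mode = select_mode(3)
print(mode.name, should_record(mode, "U", 0), should_record(mode, "D", 1))
# UPLINK_ONLY True False
```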

Recalling stored information (Figure 4) from the data structure involves displaying a
line indicating a data structure of recorded conversations and, in response to a selection of the
displayed line, replaying a recorded conversation. In addition to the ability to recall a
previously recorded message, the mobile set of the present invention would allow a
subscriber to review recorded conversations using a variety of speed controls or segment
replay controls (replaying a few seconds of voice where the audio is garbled or difficult to
distinguish).
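
Segment replay and speed control can be illustrated with simple frame arithmetic. The sketch
below assumes a restored conversation is available as a list of consecutive frames of fixed
duration, taken here as 20 ms, the length of a GSM full-rate speech frame. The names
FRAME_MS, segment and playback_ms are hypothetical.

```python
from typing import List, Sequence

FRAME_MS = 20  # assumed frame duration in milliseconds


def segment(frames: Sequence[str], start_ms: int, length_ms: int) -> List[str]:
    """Return the frames that cover the window [start_ms, start_ms + length_ms)."""
    first = start_ms // FRAME_MS
    last = -(-(start_ms + length_ms) // FRAME_MS)  # ceiling division
    return list(frames[first:last])


def playback_ms(frame_count: int, speed: float) -> float:
    """Playing time of frame_count frames at a speed factor (1.0 = normal)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return frame_count * FRAME_MS / speed


frames = [f"F{i}" for i in range(1, 11)]  # ten frames, 200 ms in total
clip = segment(frames, start_ms=40, length_ms=60)
print(clip)                         # ['F3', 'F4', 'F5']
print(playback_ms(len(clip), 0.5))  # 120.0 (half speed doubles the playing time)
```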

Another feature of the present invention is the ability to recall and play back recorded
conversations while using the mobile set as a phone. In this manner a subscriber may recall a
previously recorded conversation (either of the current call or a previous call) and play it
back for the subscriber, or transmit the recorded data through the uplink signal.

Finally, a mobile set having a voice recording means for storing voice conversations is
disclosed. The mobile set of the present invention can record signals received through the
mobile set and can play back at least a portion of those signals on the mobile set. The mobile
set comprises an uplink/downlink switch for selecting speech frames from either an uplink or
downlink signal, at least one switching logic controller for switching between the uplink and
downlink signals, a method of file header generation for generating headers for recorded
speech files, a recorder controlling means for configuring and controlling a recorder
operation in one of several modes available to a subscriber, and a memory element capable of
storing the voice recordings.



